Google data centers are the large data center facilities Google uses to provide its services. They combine large amounts of drive storage, compute nodes organized in aisles of racks, internal and external networking, environmental controls (mainly cooling and humidity control), and operations software (especially for load balancing and fault tolerance).
There is no official data on how many servers are in Google data centers, but Gartner estimated in a July 2016 report that Google at the time had 2.5 million servers. This number is changing as the company expands capacity and refreshes its hardware.
| Continent | Location | Geo / products location | Cloud location | Timeline | Description |
|---|---|---|---|---|---|
| North America | Arcola (VA), USA | Loudoun County | N. Virginia (us-east4) | 2017 - announced | |
| North America | Atlanta | Douglas County | - | 2003 - launched | 350 employees |
| South America | Cerrillos, Santiago, Chile | - | Santiago (southamerica-west1) | 2020 - announced; 2021 - launched | |
| Asia | Changhua County | Changhua County | Taiwan (asia-east1) | 2011 - announced; 2013 - launched | 60 employees |
| North America | Clarksville (TN), USA | Montgomery County | - | 2015 - announced | |
| North America | Columbus (OH), USA | - | Columbus (us-east5) | 2022 - launched | |
| North America | Council Bluffs (IA), USA | Council Bluffs | | 2007 - announced; 2009 - first phase completed; 2012 and 2015 - expanded | 130 employees |
| North America | Council Bluffs (IA), USA | | Iowa (us-central1) | | |
| Asia | Delhi | - | Delhi (asia-south2) | 2020 - announced; 2021 - launched | |
| Middle East | Doha | - | Doha (me-central1) | 2023 - launched | |
| Europe | Dublin, Ireland | Dublin | - | 2011 - announced; 2012 - launched | 150 employees |
| Europe | Eemshaven | Eemshaven | Netherlands (europe-west4) | 2014 - announced; 2016 - launched; 2018, 2019 - expanded | 200 employees |
| Europe | Frankfurt | - | Frankfurt (europe-west3) | 2022 - expanded | |
| Europe | Fredericia | Fredericia | - | 2018 - announced; 2020 - launched | €600M building costs |
| Europe | Ghlin | Saint-Ghislain | Belgium (europe-west1) | 2007 - announced; 2010 - launched | 12 employees |
| Europe | Hamina | Hamina | Finland (europe-north1) | 2009 - announced; 2011 - first phase completed; 2022 - expanded | 6 buildings, 400 employees |
| North America | Henderson (NV), USA | Henderson | Las Vegas (us-west4) | 2019 - announced; 2020 - launched | 64 acres; $1.2B building costs |
| Asia | Hong Kong | - | Hong Kong (asia-east2) | 2017 - announced; 2018 - launched | |
| Asia | Inzai | Inzai | - | 2023 - launched | |
| Asia | Jakarta | - | Jakarta (asia-southeast2) | 2020 - launched | |
| Asia | Koto-Ku, Tokyo, Japan | - | Tokyo (asia-northeast1) | 2016 - launched | |
| North America | Leesburg (VA), USA | Loudoun County | N. Virginia (us-east4) | 2017 - announced | |
| North America | Lenoir (NC), USA | Lenoir | - | 2007 - announced; 2009 - launched | over 110 employees |
| Asia | Lok Yang Way, Pioneer, Singapore | Singapore | Singapore (asia-southeast1) | 2022 - launched | |
| Europe | London | - | London (europe-west2) | 2017 - launched | |
| North America | Los Angeles | - | Los Angeles (us-west2) | | |
| Europe | Madrid | - | Madrid (europe-southwest1) | 2022 - launched | |
| Pacific | Melbourne | - | Melbourne (australia-southeast2) | 2021 - launched | |
| Europe | Middenmeer | Middenmeer | Netherlands (europe-west4) | 2019 - announced | |
| North America | Midlothian (TX), USA | Midlothian | Dallas (us-south1) | 2019 - announced; 2022 - launched | 375 acres; $600M building costs |
| Europe | Milan | - | Milan (europe-west8) | 2022 - launched | |
| North America | Moncks Corner (SC), USA | Berkeley County | South Carolina (us-east1) | 2007 - launched; 2013 - expanded | 150 employees |
| North America | Montreal | - | Montréal (northamerica-northeast1) | 2018 - launched | 62.4 hectares; $600M building costs |
| Asia | Mumbai | - | Mumbai (asia-south1) | 2017 - launched | |
| North America | New Albany (OH), USA | New Albany | - | 2019 - announced | 400 acres; $600M building costs |
| Asia | Osaka | - | Osaka (asia-northeast2) | 2019 - launched | |
| South America | Osasco | - | São Paulo (southamerica-east1) | 2017 - launched | |
| North America | Papillion (NE), USA | Papillion | - | 2019 - announced | 275 acres; $600M building costs |
| Europe | Paris | - | Paris (europe-west9) | 2022 - launched | |
| North America | Pryor Creek (OK), USA | Mayes County | - | 2007 - announced; 2012 - expanded | over 400 employees; land at MidAmerica Industrial Park |
| South America | Quilicura | Quilicura | - | 2012 - announced; 2015 - launched | up to 20 employees expected; a million-dollar investment plan to increase capacity was announced in 2018 |
| North America | Reno (NV), USA | Storey County | - | 2017 - 1,210 acres of land bought in the Tahoe Reno Industrial Center; 2018 - announced | |
| North America | Salt Lake City (UT), USA | - | Salt Lake City (us-west3) | 2020 - launched | |
| Asia | Seoul | - | Seoul (asia-northeast3) | 2020 - launched | |
| Pacific | Sydney | - | Sydney (australia-southeast1) | 2017 - launched | |
| Middle East | Tel Aviv | - | Tel Aviv (me-west1) | 2022 - launched | |
| North America | The Dalles (OR), USA | The Dalles | Oregon (us-west1) | 2006 - launched | 80 full-time employees |
| North America | Toronto | - | Toronto (northamerica-northeast2) | 2021 - launched | |
| Europe | Turin | - | Turin (europe-west12) | 2023 - launched | |
| South America | Vinhedo | | São Paulo (southamerica-east1) | | |
| Europe | Warsaw | - | Warsaw (europe-central2) | 2019 - announced; 2021 - launched | |
| Asia | Wenya | Singapore | Singapore (asia-southeast1) | 2011 - announced; 2013 - launched; 2015 - expanded | |
| North America | Widows Creek (Bridgeport) (AL), USA | Jackson County | - | 2018 - broke ground | |
| Europe | Zürich, Switzerland | - | Zurich (europe-west6) | 2018 - announced | |
| Europe | Austria | | | 2022 - announced | |
| Europe | Berlin | | Berlin (europe-west10) | 2021 - announced; 2023 August - launched | |
| Middle East | Dammam | | | 2021 - announced | |
| Europe | Athens | | | 2022 - announced | |
| North America | Kansas City, Missouri | | | 2019 - announced | |
| Middle East | Kuwait | | | 2023 - announced | |
| Asia | Malaysia | | | 2022 - announced | |
| Pacific | Auckland, New Zealand | | | 2022 - announced | |
| Europe | Oslo | | | 2022 - announced | |
| North America | Querétaro, Mexico | | | 2022 - announced | |
| Africa | Johannesburg, South Africa | | Johannesburg (africa-south1) | 2022 - announced; 2024 - launched | |
| Europe | Sweden | | | 2022 - announced | |
| Asia | Tainan City | - | Taiwan (asia-east1) | 2019 September - announced | |
| Asia | Thailand | | | 2022 - announced | |
| Asia | Yunlin County | - | Taiwan (asia-east1) | 2020 September - announced | |
| North America | Mesa (AZ), USA | | | 2023 - construction started | |
| Europe | Waltham Cross | | | 2024 January - announced | |
| South America | Canelones, Uruguay | | | 2024 - construction started; 2026 - inauguration expected | |
At the time, on average, a single search query read ~100 megabytes of data and consumed on the order of tens of billions of CPU cycles. During peak hours, Google served ~1,000 queries per second. To handle this peak load, they built a compute cluster of ~15,000 commodity-class PCs instead of expensive supercomputer hardware, to save money. To make up for the lower hardware reliability, they wrote fault-tolerant software.
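As a rough back-of-the-envelope check, taking ~10^10 cycles per query as an order-of-magnitude assumption, the peak load suggests why a cluster of this size was needed:

```latex
\[
10^{3}\ \tfrac{\text{queries}}{\text{s}} \times 10^{10}\ \tfrac{\text{cycles}}{\text{query}}
  = 10^{13}\ \tfrac{\text{cycles}}{\text{s}},
\qquad
\frac{10^{13}\ \text{cycles/s}}{1.5\times 10^{4}\ \text{machines}}
  \approx 7\times 10^{8}\ \tfrac{\text{cycles}}{\text{s per machine}},
\]
```

which is roughly the clock rate of a single commodity server of that era, provided the work parallelizes across the fleet with little overhead.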
The structure of the cluster consists of five parts. Central Google Web servers (GWS) face the public Internet. Upon receiving a user request, the GWS communicates with a spell checker, an advertisement server, many index servers, and many document servers. Each of these four parts answers part of the request, and the GWS assembles their responses and serves the final response to the user.
The raw documents were ~100 TB, and the index files were ~10 TB. The index files are sharded, and each shard is served by a "pool" of index servers. The raw documents are sharded similarly. Each query to the index results in a list of document IDs, which are then sent to the document servers to retrieve the titles and keyword-in-context snippets.
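A minimal sketch of this request flow in Python; the shard counts, helper names, and fake data below are illustrative assumptions, since the real GWS, index-server, and document-server protocols are not public:

```python
# Sketch of the described query flow: the GWS fans a query out to sharded index
# servers, collects document IDs, then fetches titles/snippets from document
# servers. All names, counts, and data here are illustrative.

NUM_INDEX_SHARDS = 4  # assumption: real deployments used many more shards

def index_shard_lookup(shard_id: int, query: str) -> list[int]:
    """Stand-in for one pool of index servers: returns matching doc IDs."""
    fake_index = {0: {"cluster": [10, 14]}, 1: {"cluster": [21]},
                  2: {"cluster": [33]},     3: {"cluster": [42, 47]}}
    return fake_index.get(shard_id, {}).get(query, [])

def doc_shard_fetch(doc_id: int) -> dict:
    """Stand-in for a document server: returns title and snippet for a doc ID."""
    return {"id": doc_id, "title": f"doc-{doc_id}", "snippet": "...keyword in context..."}

def google_web_server(query: str) -> list[dict]:
    # 1. Fan out to every index shard and merge the posting lists.
    doc_ids = []
    for shard in range(NUM_INDEX_SHARDS):
        doc_ids.extend(index_shard_lookup(shard, query))
    # 2. Route each doc ID to the document servers and fetch titles/snippets.
    results = [doc_shard_fetch(d) for d in sorted(doc_ids)]
    # 3. Assemble the final response (ads and spell-check would be merged here too).
    return results

if __name__ == "__main__":
    print(google_web_server("cluster"))
```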
There were several CPU generations in use, ranging from single-processor 533 MHz Celeron-based servers to dual 1.4 GHz Intel Pentium III servers. Each server contained one or more hard drives of 80 GB each. Index servers had less disk space than document servers. Each rack had two Ethernet switches, one per side. The servers on each side interconnected via 100 Mbit/s Ethernet, and each switch had a ~250 MB/s uplink to a central switch that connected all racks.
The design objectives included, among others, linear scalability: due to the massive parallelism, scaling up the hardware scales up throughput linearly, i.e. doubling the size of the compute cluster doubles the number of queries that can be served per second.
The cluster is built from server racks in two configurations: 40 1U servers per side (two sides per rack), or 20 2U servers per side. A rack draws about 10 kW at a power density of about 400 W/ft², consumes on the order of 10 MWh per month, and costs about $1,500 per month.
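A quick consistency check on those figures; the 730 hours per month and the implied electricity rate are assumptions, not numbers from the source:

```latex
\[
10\ \text{kW} \times 730\ \tfrac{\text{h}}{\text{month}} \approx 7{,}300\ \tfrac{\text{kWh}}{\text{month}} \approx 7.3\ \tfrac{\text{MWh}}{\text{month}},
\qquad
\frac{\$1{,}500}{7{,}300\ \text{kWh}} \approx \$0.21\ \text{per kWh}.
\]
```

The ~10 MWh/month figure would be consistent with adding cooling and power-distribution overhead on top of the servers' own ~7.3 MWh draw.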
The customization goal is to purchase CPU generations that offer the best performance per dollar, not absolute performance. How this is measured is unclear, but it is likely to incorporate running costs of the entire server, and CPU power consumption could be a significant factor.
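A hedged sketch of how such a comparison might be scored; the scoring formula, prices, power draws, and throughput numbers below are illustrative assumptions, not Google's actual methodology:

```python
# Illustrative price/performance comparison over a server's lifetime.
# All numbers and the scoring formula are assumptions for demonstration only.

def perf_per_dollar(queries_per_sec: float,
                    purchase_price: float,
                    power_watts: float,
                    lifetime_years: float = 3.0,
                    usd_per_kwh: float = 0.15) -> float:
    """Queries/second per lifetime dollar, counting purchase plus energy cost."""
    hours = lifetime_years * 365 * 24
    energy_cost = power_watts / 1000.0 * hours * usd_per_kwh
    return queries_per_sec / (purchase_price + energy_cost)

# Hypothetical candidate servers (made-up prices, power, and throughput):
candidates = {
    "single 533 MHz Celeron": perf_per_dollar(30, 800, 60),
    "dual 1.4 GHz Pentium III": perf_per_dollar(90, 1900, 120),
}
for name, score in candidates.items():
    print(f"{name}: {score:.4f} queries/s per $")
```

The point of such a metric is that a cheap, slower CPU can win once energy and other running costs are folded into the denominator.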
According to Google, the electrical power drawn by its global data center operations ranges between 500 and 681 megawatts. The combined processing power of these servers might have reached 20 to 100 petaflops in 2008 ("Google Surpasses Supercomputer Community, Unnoticed?", May 20, 2008).
In order to run such a large network, with direct connections to as many ISPs as possible at the lowest possible cost, Google has a very open peering policy.
Public peering data shows that the Google network can be accessed from 67 public exchange points and 69 different locations across the world. As of May 2012, Google had 882 Gbit/s of public connectivity (not counting private peering agreements with the largest ISPs). This public network is used to distribute content to Google users and to crawl the internet to build its search indexes. The private side of the network is a secret, but a disclosure from Google indicates that they use custom-built high-radix switch-routers (with a capacity of 128 × 10 Gigabit Ethernet ports) for the wide area network. Running no fewer than two routers per data center (for redundancy), the Google network scales into the terabit-per-second range; with two fully loaded routers, the bisectional bandwidth amounts to 1,280 Gbit/s.
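The 1,280 Gbit/s figure follows directly from the stated port configuration:

```latex
\[
128\ \text{ports} \times 10\ \tfrac{\text{Gbit/s}}{\text{port}} = 1{,}280\ \tfrac{\text{Gbit}}{\text{s}}\ \text{per fully loaded router},
\qquad
2 \times 1{,}280 = 2{,}560\ \tfrac{\text{Gbit}}{\text{s}}\ \text{aggregate per data center}.
\]
```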
These custom switch-routers are connected to DWDM devices to interconnect data centers and points of presence (PoPs) via dark fiber.
From a data center view, the network starts at the rack level, where 19-inch racks are custom-made and contain 40 to 80 servers (20 to 40 1U servers on either side, while newer servers are 2U rackmount systems; see "Web Search for a Planet: The Google Cluster Architecture" by Luiz André Barroso, Jeffrey Dean, and Urs Hölzle). Each rack has an Ethernet switch. Servers are connected via a 1 Gbit/s Ethernet link to the top-of-rack switch (ToR). ToR switches are then connected to a gigabit cluster switch using multiple gigabit or ten-gigabit uplinks. The cluster switches themselves are interconnected and form the data center interconnect fabric, most likely using a dragonfly design rather than a classic butterfly or flattened-butterfly layout (Dennis Abts, "High Performance Datacenter Networks: Architectures, Algorithms, and Opportunities").
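A small sketch of the bandwidth hierarchy this implies; the uplink count and speed per top-of-rack switch are assumptions chosen only to illustrate the oversubscription calculation:

```python
# Sketch of the rack-level topology described above: servers -> ToR switch ->
# cluster switch. Uplink count/speed per ToR are assumptions for illustration.

SERVERS_PER_RACK = 80     # 40 to 80 per the text; take the upper bound
SERVER_LINK_GBPS = 1      # 1 Gbit/s per server to the ToR
TOR_UPLINKS = 4           # assumption: "multiple gigabit or ten gigabit uplinks"
TOR_UPLINK_GBPS = 10

downlink_capacity = SERVERS_PER_RACK * SERVER_LINK_GBPS
uplink_capacity = TOR_UPLINKS * TOR_UPLINK_GBPS
print(f"ToR oversubscription: {downlink_capacity / uplink_capacity:.1f}:1")
# -> 2.0:1 with these assumed numbers; real ratios depend on the uplink config.
```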
From an operational standpoint, when a client computer attempts to connect to Google, several DNS servers resolve www.google.com into multiple IP addresses via a round-robin DNS policy. This acts as the first level of load balancing and directs the client to different Google clusters. A Google cluster has thousands of servers; once the client has connected, additional load balancing sends its queries to the least-loaded web server. This makes Google one of the largest and most complex content delivery networks.
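This first level of balancing is visible to any client: resolving www.google.com typically returns several addresses, and which ones are returned depends on the resolver, location, and time. A minimal sketch using only the Python standard library:

```python
# Resolve www.google.com and print the distinct IP addresses returned.
# The exact addresses vary by resolver, location, and time, which reflects the
# DNS-level load balancing described above.
import socket

infos = socket.getaddrinfo("www.google.com", 443, proto=socket.IPPROTO_TCP)
addresses = sorted({info[4][0] for info in infos})
for addr in addresses:
    print(addr)
```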
Google has numerous data centers scattered around the world. At least 12 significant Google data center installations are located in the United States. The largest known centers are located in The Dalles, Oregon; Atlanta; Reston, Virginia; Lenoir, North Carolina; and Moncks Corner, South Carolina. In Europe, the largest known centers are in Eemshaven and Groningen in the Netherlands and Mons, Belgium. Google's Oceania Data Center is located in Sydney, Australia.
Google halted work on the barges in late 2013 and began selling off the barges in 2014.
The software that runs the Google infrastructure includes:
Google has developed several abstractions which it uses for storing most of its data:
To lessen the effects of unavoidable hardware failure, software is designed to be fault tolerant. Thus, when a system goes down, data is still available on other servers, which increases reliability.
The index is partitioned by document IDs into many pieces called shards. Each shard is replicated onto multiple servers. Initially, the index was served from hard disk drives, as is done in traditional information retrieval (IR) systems. Google dealt with the increasing query volume by increasing the number of replicas of each shard and thus the number of servers. Soon they found that they had enough servers to keep a copy of the whole index in main memory (although with low replication or no replication at all), and in early 2001 Google switched to an in-memory index system. This switch "radically changed many design parameters" of their search system, and allowed for a significant increase in throughput and a large decrease in query latency.
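A minimal sketch of the partition-and-replicate idea; the shard count, replica count, and assignment scheme below are illustrative assumptions, not Google's actual implementation:

```python
# Illustrative document-ID sharding with replicated shard servers.
# Shard/replica counts and the assignment scheme are assumptions.
import random

NUM_SHARDS = 8
REPLICAS_PER_SHARD = 3

def shard_for(doc_id: int) -> int:
    """Partition the index by document ID."""
    return doc_id % NUM_SHARDS

def pick_replica(shard_id: int) -> str:
    """Choose one of the shard's replicas (a real system picks the least loaded)."""
    replica = random.randrange(REPLICAS_PER_SHARD)
    return f"index-server-{shard_id}-{replica}"

doc_id = 123456
print(shard_for(doc_id), pick_replica(shard_for(doc_id)))
```

Adding replicas of each shard raises query throughput, while adding shards keeps each server's slice small enough to fit in main memory.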
In June 2010, Google rolled out a next-generation indexing and serving system called "Caffeine" which can continuously crawl and update the search index. Previously, Google updated its search index in batches using a series of MapReduce jobs. The index was separated into several layers, some of which were updated faster than the others, and the main layer wouldn't be updated for as long as two weeks. With Caffeine, the entire index is updated incrementally on a continuous basis. Later, Google revealed a distributed data processing system called "Percolator" (Daniel Peng and Frank Dabek, "Large-scale Incremental Processing Using Distributed Transactions and Notifications", Proceedings of the 9th USENIX Symposium on Operating Systems Design and Implementation, 2010), which is said to be the basis of the Caffeine indexing system (The Register, "Google Caffeine jolts worldwide search machine"; The Register, "Google Percolator – global search jolt sans MapReduce comedown").
In December 2016, Google announced that—starting in 2017—it would purchase enough renewable energy to match 100% of the energy usage of its data centers and offices. The commitment will make Google "the world's largest corporate buyer of renewable power, with commitments reaching 2.6 gigawatts (2,600 megawatts) of wind and solar energy".